PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation

Authors

Abstract

Audio tagging is an active research area and has a wide range of applications. Since the release of AudioSet, great progress has been made in advancing model performance, which mostly comes from the development of novel model architectures and attention modules. However, we find that appropriate training techniques are equally important for building audio tagging models with AudioSet, but have not received the attention they deserve. To fill the gap, in this work, we present PSLA, a collection of training techniques that can noticeably boost the model accuracy, including ImageNet pretraining, balanced sampling, data augmentation, label enhancement, model aggregation, and their design choices. By training an EfficientNet with these techniques, we obtain a single model (with 13.6M parameters) and an ensemble model that achieve mean average precision (mAP) scores of 0.444 and 0.474 on AudioSet, respectively, outperforming the previous best system of 0.439 with 81M parameters. In addition, our model also achieves a new state-of-the-art mAP of 0.567 on FSD50K.
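
For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how mean average precision is commonly computed for multi-label tagging. It follows the standard textbook definition and is not taken from the authors' evaluation code; the function names and the toy data are hypothetical, and it assumes every class has at least one positive clip.

```python
from typing import List, Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AP for one class: mean of precision@k taken at each positive's rank."""
    # Rank clips from highest to lowest predicted score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precisions: List[float] = []
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            precisions.append(hits / rank)
    if not precisions:
        # Assumption of this sketch: classes without positives are an error.
        raise ValueError("class has no positive examples")
    return sum(precisions) / len(precisions)


def mean_average_precision(
    label_matrix: Sequence[Sequence[int]],
    score_matrix: Sequence[Sequence[float]],
) -> float:
    """Unweighted mean of per-class AP; rows are clips, columns are classes."""
    num_classes = len(label_matrix[0])
    per_class = []
    for c in range(num_classes):
        labels = [row[c] for row in label_matrix]
        scores = [row[c] for row in score_matrix]
        per_class.append(average_precision(labels, scores))
    return sum(per_class) / num_classes


if __name__ == "__main__":
    # Toy example: 4 clips, 2 classes (made-up values for illustration only).
    y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
    y_score = [[0.9, 0.2], [0.8, 0.7], [0.3, 0.6], [0.1, 0.4]]
    print(round(mean_average_precision(y_true, y_score), 4))
```

Because each class contributes equally to the average, rare classes weigh as much as frequent ones in the final score.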

Journal

Journal title: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Year: 2021

ISSN: 2329-9304, 2329-9290

DOI: https://doi.org/10.1109/taslp.2021.3120633